Abstract: The classification of events involving jets as signal-like or background-like can depend
strongly on the jet algorithm used and its parameters. This is partly due to the fact that standard
jet algorithms yield a single partition of the particles in an event into jets, even if no particular
choice stands out from the others. As an alternative, we propose that one should consider multiple
interpretations of each event, generalizing the Qjets procedure to event-level analysis. With multiple
interpretations, an event is no longer restricted to either satisfy cuts or not satisfy them; it can be
assigned a weight between 0 and 1 based on how well it satisfies the cuts. These cut-weights can
then be used to improve the discrimination power of an analysis or reduce the uncertainty on mass
or cross-section measurements. For example, using this approach on a Higgs plus Z boson sample
with $H \to b\bar{b}$, we find that a 28% improvement in significance can be realized at the 8 TeV LHC.
Through a number of other examples, we show various ways in which having multiple interpretations
can be useful on the event level.

1 Introduction

Almost every event recorded at the Large Hadron Collider contains some number of jets. Sometimes 
the jets are the objects of interest, as in a search for dijet resonances. Sometimes they are indications 
of contamination and a jet veto can be used to increase signal purity. Even in events that are pre- 
dominantly electroweak some amount of jet activity is usually present. Techniques for analyzing jets, 
in particular, the substructure of jets, have been increasing in sophistication in recent years. Some 
recent reviews are [1-4]. 

To use jets for any sort of analysis, one first needs a way to translate the hadronic activity in the 
event into a set of jets. At the LHC, this is done almost universally with sequential recombination 
algorithms. These algorithms, such as the anti-$k_T$ [5], Cambridge/Aachen [6, 7], and $k_T$ [8, 9]
algorithms, assemble jets by merging particles in a sequence determined by some fixed distance measure.
The result of applying such an algorithm to an event is a tree containing a sequence of branchings. The 
jets resulting from running a jet algorithm represent the algorithm's best guess as to which particles 
should be associated with the fragmentation of the same hard parton. In this paper (unlike in [10]) 
we will only be interested in which particles end up in which jet, not the structure of the clustering 
tree. 



In the majority of cases, such as when there are a few, well-separated jets, the best guess interpre- 
tation from any algorithm provides an excellent representation of the event. However, for events with 
multiple and overlapping jets, the interpretations can differ greatly among algorithms, or even when 
the parameters (such as the jet size R) of a single algorithm are varied. Ideally, one would like to treat 
events which are sensitive to the jet algorithm or jet parameters differently from ones which are more 
robust to algorithmic variations. In this paper, we propose a way to consider multiple interpretations 
of an event at once. 

Intuitively it makes sense that considering multiple interpretations of an event should yield useful 
information. Indeed, probabilistic jet algorithms were first discussed long ago in relation to improving 
the behavior of seeded jet algorithms [11]. Other related approaches to jets include combining observ- 
ables to improve discovery significance [12-14], comparing multiple interpretations of jet reconstruction 
with models of showering in signal and background processes [15, 16], and measuring the "fuzziness" 
of jet reconstruction [17]. Here we consider how multiple interpretations of each event can be used to 
turn single observables (e.g. dijet invariant mass) into distributions. This idea was proposed in [10] 
and called Qjets. In [10] a proof-of-concept application of Qjets was given which focused on tree-based 
jet substructure. It was shown that Qjets can improve the statistical discriminating power in the 
search for boosted hadronically decaying objects. In this paper, we apply the multiple-interpretations 
aspect of the Qjets approach to jet reconstruction over a full event. 

The basic idea behind Qjets is to sample interpretations near what a traditional jet algorithm 
would give. During a clustering step, a traditional jet algorithm merges the two closest particles 
based on some distance measure. One possible way to sample interpretations around this standard 
interpretation is, rather than always merging the two closest particles, to merge two particles with 
some probability depending on how close they are. The result is a set of N interpretations of each 
event. 

There are a number of ways one can process these N interpretations. In [10], the N trees
constructed from the particles in a single jet were pruned [18]. Pruning throws out some particles based
on the branching sequence in the tree. Since the pruned trees have different particles for each tree, 
the jet properties are different. For example, since N different jet masses result one can look at the 
width of the mass distribution for a single jet. This width, called volatility in [10], was shown to be a 
useful discriminant between signal and background jets in certain cases. 

In this paper, we apply the multiple interpretations idea of Qjets to an entire event, and we do 
not apply pruning (or any other grooming procedure). Instead, we exploit the fact that different 
clusterings will give jets with different 4-vectors. For example, if a particle is halfway between two 
jets, it might get clustered into each jet half the time. Or a particle which classical anti-$k_T$ clusters
with the beam now has some probability to be clustered into a jet. With multiple interpretations, 
particles can be associated with many different jets, in contrast to classical algorithms where each 
particle is always associated with exactly one jet. 

The result of applying Qjets to an event is a set of N interpretations of that event. One way to 
process these interpretations is to apply some cuts to them, as one would in a classical analysis. For 
example, one can impose a dijet mass window cut or a $p_T$ cut. While with a classical algorithm, an
event would either pass or not pass the cuts, with Qjets, a fraction z of the interpretations pass. We
call z the cut-weight. Events with z close to 1 are then very likely to be signal, while events with
z close to zero are unlikely to be signal. Although one can try cutting directly on z (similar to cuts 
on volatility in [10]), it is better to use z to compute a statistical weight for a given event. That is, 
instead of throwing events out, each event is weighted by how signal-like it appears. Then one simply 
constructs the distribution of, say, the dijet invariant mass, with each event weighted by its z-value.
The statistical fluctuations on this weighted invariant mass will be smaller (often much smaller) than
if $z = 0$ or $z = 1$ are the only possibilities considered (as in a classical analysis). We will show that
using weighted events in this way can provide significant improvements in the size of a signal divided
by the characteristic background uncertainty, $S/\delta B$, for many event classes.

In this paper we consider four processes: 1) $Z + H$ with $H \to b\bar{b}$, 2) a heavy scalar $\phi$ produced
in association with a Z boson with $\phi \to$ dijets, 3) a 1 TeV dijet resonance, and 4) a heavy scalar
decaying to two other scalars which each decay to dijets. In cases where the event topology is simple and
unambiguous, for example when there are two well separated jets, we find that standard algorithms 
perform quite well and the use of multiple interpretations only provides a marginal improvement. 
However, in more complex cases where events have jets with potentially overlapping boundaries, using 
the multiple interpretations can substantially improve significance over standard cut-based analyses. 
As an important example, we find a 28% improvement in $S/\delta B$ for $Z + H$ over its $Z + b\bar{b}$ irreducible
background. 

This article is structured as follows. In Sec. 2 we will present a modification of the anti-$k_T$
algorithm to make it non-deterministic. Modifying the jet algorithm in this way generates a Monte- 
Carlo sampling of the distribution of interpretations around the best-guess interpretation. Some ways 
to visualize the effect of multiple interpretations are presented in Sec. 3 and 4. In Sec. 5 we derive 
a formula for the statistical significance using our method. Sec. 6 applies the algorithm to several 
samples of phenomenological interest. Some comments on the speed of the algorithm are given in 
Sec. 7. Conclusions are in Sec. 8. 

2 Qanti-$k_T$: a non-deterministic anti-$k_T$ algorithm

We begin by describing how the anti-$k_T$ algorithm can be modified to provide multiple interpretations
of an event. While one would ideally sample every possible reconstruction of an event, collider events
typically contain a large number of final state particles so this is impractical. Instead we generate a
representative sample of interpretations by using a Monte-Carlo integration type approach. A FastJet
plugin with an implementation of this Qanti-$k_T$ algorithm is available at
http://jets.physics.harvard.edu/Qantikt.

The Qanti-$k_T$ algorithm works as follows. The input is a set of 4-vectors representing each particle's
4-momentum. These can be the stable hadrons in an event, charged tracks coming from a primary
interaction, calorimeter cells, topoclusters, or the output of a Monte Carlo.

1. First calculate the distances $d_{ij}$ between each pair of 4-vectors and also the distances $d_{iB}$ between
each 4-vector and the beam. The metric used for the distance calculation is that of anti-$k_T$,
although the procedure can easily be modified to work with C/A or $k_T$. The anti-$k_T$ distance
measure is

$$d_{ij} = \min\left(p_{Ti}^{-2},\, p_{Tj}^{-2}\right) \frac{\Delta R_{ij}^2}{R^2}, \qquad (2.1)$$

and

$$d_{iB} = p_{Ti}^{-2}, \qquad (2.2)$$

where $\Delta R_{ij} = \sqrt{(y_i - y_j)^2 + (\phi_i - \phi_j)^2}$ is the angular distance between a pair of 4-momenta
i and j, with $y$ the rapidity and $\phi$ the azimuthal angle. $R$ is a free parameter in the anti-$k_T$
algorithm, representing the size of the final jets one is interested in.



2. A weight is then assigned to each pair:

$$\omega_{ij}^{(\alpha)} = \exp\left(-\alpha\, \frac{d_{ij} - d^{\min}}{d^{\min}}\right), \qquad (2.3)$$

where $d^{\min}$ is the minimum distance over all pairs at this stage of the clustering and $\alpha$ is a real
number called rigidity in [10].

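3. A random number is used to choose a pair to merge. The probability of merging a given pair is

$$P_{ij}^{(\alpha)} = \frac{\omega_{ij}^{(\alpha)}}{\sum_{\langle k,l \rangle} \omega_{kl}^{(\alpha)}} \qquad (2.4)$$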

4. Repeat until all particles have been merged into jets or a beam. 
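
The four steps above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the FastJet plugin mentioned earlier: particles are hypothetical `(pt, y, phi)` tuples, the `merge` helper uses a simple $p_T$-weighted recombination chosen for brevity, an object selected for a beam merge is simply promoted to a final jet, and the function names are invented for this example.

```python
import math
import random

def delta_r2(a, b):
    """Squared angular distance between two (pt, y, phi) tuples."""
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (a[1] - b[1]) ** 2 + dphi ** 2

def merge(a, b):
    """Hypothetical pt-weighted recombination (an assumption made for brevity).
    The phi average is naive and only adequate away from the +-pi wrap."""
    pt = a[0] + b[0]
    return (pt, (a[0] * a[1] + b[0] * b[1]) / pt, (a[0] * a[2] + b[0] * b[2]) / pt)

def qantikt(particles, R=0.5, alpha=1.0, rng=None):
    """Return one randomly sampled interpretation (a list of jets) of the event."""
    rng = rng or random.Random()
    objs = list(particles)
    jets = []
    while objs:
        # Step 1: beam distances (Eq. 2.2) and pairwise distances (Eq. 2.1).
        cands = []
        for i, a in enumerate(objs):
            cands.append((a[0] ** -2, i, None))
            for j in range(i + 1, len(objs)):
                b = objs[j]
                d = min(a[0] ** -2, b[0] ** -2) * delta_r2(a, b) / R ** 2
                cands.append((d, i, j))
        dmin = min(d for d, _, _ in cands)
        # Step 2: weights (Eq. 2.3), with the d_min = 0 case handled separately.
        if alpha == 0.0:
            weights = [1.0] * len(cands)
        elif dmin > 0.0:
            weights = [math.exp(-alpha * (d - dmin) / dmin) for d, _, _ in cands]
        else:
            weights = [1.0 if d == 0.0 else 0.0 for d, _, _ in cands]
        # Step 3: pick one candidate with probability proportional to its weight.
        _, i, j = rng.choices(cands, weights=weights, k=1)[0]
        if j is None:
            jets.append(objs.pop(i))      # beam merge: object becomes a final jet
        else:
            b = objs.pop(j)               # j > i, so remove j first
            a = objs.pop(i)
            objs.append(merge(a, b))
        # Step 4: repeat until nothing is left to cluster.
    return jets
```

Calling `qantikt` N times on the same particle list with a finite `alpha` gives N interpretations of the event; a very large `alpha` reproduces the deterministic ordering.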

At its heart, the Qanti-$k_T$ algorithm is still a sequential recombination algorithm. However, the
weights and their Monte-Carlo sampling modify the order of the merging and change which particles
get clustered into each jet or the beam on each iteration. In a traditional sequential recombination
algorithm the jets closest in distance are merged first. In Qanti-$k_T$, each jet-jet or jet-beam distance is
assigned a weight, controlled by a parameter $\alpha$, which allows the recombination order to vary. For
a given event we find it is typically sufficient to repeat the Qanti-$k_T$ procedure a few tens of times
at the same value of $\alpha$ for our results to stabilize^1. The result can be thought of as a Monte-Carlo
calculation of the distribution of interpretations around a best guess.

When $\alpha = 0$, all distances are given equal weighting, which means that particles far apart could
be merged into the same jet early in the clustering. Somewhat surprisingly, despite the random
clustering, in [10] it was found that Qjets can still distinguish signal from background even when
$\alpha = 0$, although that will not be the case here. As $\alpha$ increases in value, clusterings closer to those
of anti-$k_T$ have higher weights and are consequently more likely to be realized. One can think of $\alpha$
as somewhat like $1/\hbar$. In analogy with the $\hbar \to 0$ limit of quantum mechanics we term the $\alpha \to \infty$
limit the classical limit. In the classical limit the pair of particles closest in distance is always merged
and the diversity of interpretations is lost.
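
The role of the rigidity can be seen numerically by evaluating the merge probabilities of Eqs. (2.3) and (2.4) for a fixed set of distances. The sketch below assumes all distances are strictly positive, and the function name `merge_probabilities` and the example distances are made up for illustration.

```python
import math

def merge_probabilities(distances, alpha):
    """Merge probabilities from Eqs. (2.3)-(2.4); assumes min(distances) > 0."""
    dmin = min(distances)
    weights = [math.exp(-alpha * (d - dmin) / dmin) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative distances only (arbitrary units).
example = [1.0, 2.0, 4.0]
for alpha in (0.0, 1.0, 100.0):
    print(alpha, merge_probabilities(example, alpha))
```

At `alpha = 0` every candidate is equally likely, while at large `alpha` essentially all of the probability sits on the closest pair, which is the classical limit described above.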

In addition to $\alpha$, the jet radius parameter R can also be varied. For finite $\alpha$ the final jets are no
longer circles of radius R as they are in classical anti-$k_T$. Indeed, with Qanti-$k_T$ there is no longer
even a precise notion of where the jets are. This can be seen in Fig. 1 below. As a result, there is less
sensitivity to the precise choice of R when using Qanti-$k_T$ than when using classical algorithms. This
speaks to the general trends observed in [10]: with Qjets, results depend much more weakly on the jet
algorithm and algorithm parameters than with classical jets.

It is worth pointing out that the Qanti-$k_T$ algorithm is infrared and collinear (IRC) safe. IRC
safety means that when an arbitrarily soft particle is added or a particle is split into two particles
in the same direction, the results are unchanged. Qanti-$k_T$ is IRC safe as long as $\alpha > 0$. To see
this, first note that all sequential recombination algorithms are by their nature infrared safe: any
infinitesimally soft emission will simply be clustered with harder radiation during the recombination
process and will thus have no effect on the final outcome. For collinear splittings, one might worry
that when non-determinism is added to the clustering, if a particle is split in two, the two halves might
be clustered differently. However, note that $d_{ij} \propto (\Delta R_{ij})^2$ (see Eq. 2.1), and therefore $d_{ij}$ between two
particles which are exactly collinear will be exactly zero. When this happens, $d^{\min} = 0$ as well and so,
for $\alpha > 0$, $P_{ij} = 1$ for collinear particles and $P_{ij} = 0$ otherwise. Thus, collinear particles will always
be clustered before non-collinear ones, and collinear particles will always end up in the same jet.



^1 In this paper we will always run Qanti-$k_T$ N = 100 times per event.

Figure 1: The top-left panel shows the $\eta \times \phi$ plot of a simulated $pp \to \phi\phi \to gggg$ event at the LHC,
with $m_\phi = 500$ GeV. The top-middle panel shows the jet areas associated with the four jets which
best reconstruct the event using the classical anti-$k_T$ algorithm (see Sec. 6.4). The colors show the
detector elements where zero-energy ghost particles would get clustered into each jet. The remaining
plots show the frequency with which a cell is clustered into one of the four jets which best reconstruct
each event for different choices of $\alpha$. Blue squares indicate a cell is nearly always included amongst
the four hardest jets, green squares indicate that the cell is included roughly half the time, while pink
indicates a cell is only rarely included. The same event is shown in all plots.



3 Overlapping jets and jet area 

Before applying Qanti-$k_T$ to a signal/background discrimination task, we can explore how it differs
from classical algorithms. An advantage of Qanti-$k_T$ is that particles are not always clustered into the
same jets. This is particularly useful in contexts where jets overlap. With overlapping jets, classical
algorithms must assign each particle to exactly one jet. But Qanti-$k_T$ can split the particles into each
jet some fraction of the time.^2

^2 A note on our sample composition: we generate our signal and background events using a combination of Madgraph
v5.7 [19] and Pythia v6.4 [20]. All events were generated assuming an 8 TeV LHC. We group the visible output of Pythia
into massless $\delta\eta \times \delta\phi = 0.1 \times 0.1$ cells with $|\eta| < 5$. Each type of event is analyzed with both Qanti-$k_T$ and
also standard anti-$k_T$ for comparison. We use Fastjet v2.4.2 [21] to generate the standard anti-$k_T$ results.

Figure 2: The jet area computed using Qanti-$k_T$ for various choices of the rigidity parameter $\alpha$.
Shown is the area of the hardest jet in $\phi \to gg$ dijet events with $m_\phi = 1$ TeV using R = 1.1.



To see how Qanti-$k_T$ handles overlapping jets, consider the four-jet event shown in Fig. 1. This
event is $pp \to \phi\phi \to gggg$ at the parton level, a process examined in Sec. 6.4. In order to demonstrate
that particles between jets can get clustered into different jets, we show what happens when ghost
particles are added to the event. Ghost particles were introduced in [22] as a way to characterize
the area of a detector to which a jet is sensitive. Ghost particles are zero energy particles scattered
throughout the acceptance region. Since they have zero energy, they do not affect the location or
4-momentum of the final jets. The top-middle panel of Fig. 1 shows the areas associated with the four jets
which best reconstruct the event using classical anti-$k_T$ (see Sec. 6.4). This panel is similar to the
bottom right panel of Fig. 1 of [5].

The remaining panels in Fig. 1 show the frequency with which individual cells are clustered into
the four jets which best reconstruct the event using Qjets for various $\alpha$. We see that for small values
of $\alpha$ there is little well defined structure to the event, while for $\alpha = 0.1$ we begin to see jetty areas
of activity with amorphous borders. Finally, for larger values of $\alpha$ we begin to resolve the standard
anti-$k_T$ circular jet shapes. Note in particular from the $\alpha = 10$ panel that there are five jets relevant
in this event: there is no clear choice between which four should be used in the reconstruction. This
is precisely the sort of ambiguity which the multiple-interpretations approach can efficiently exploit.
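
The per-cell frequencies plotted in the Qjets panels of Fig. 1 amount to a simple count over interpretations. A minimal sketch follows, assuming each interpretation has already been reduced to the set of cell identifiers that ended up in the selected jets; the function name and the `(i_eta, i_phi)` cell labels are hypothetical.

```python
from collections import Counter

def cell_frequencies(interpretations):
    """Fraction of interpretations in which each cell is clustered into the selected jets.

    interpretations: list of iterables of hashable cell ids, one per interpretation.
    """
    n = len(interpretations)
    counts = Counter()
    for cells in interpretations:
        counts.update(set(cells))   # count each cell at most once per interpretation
    return {cell: count / n for cell, count in counts.items()}
```

Cells with frequency near 1 correspond to the cores of the jets, while intermediate values mark the ambiguous border regions.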

One can be more quantitative about the area clustered into each jet using the jet area proposed
in [22]. In a classical algorithm, this is the area of the detector clustered into a given jet. With Qjets,
the area varies for each clustering. Thus the jet area becomes a distribution. This distribution is
shown in Fig. 2, averaged over many events for R = 1.1. Jet area using the classical anti-$k_T$ algorithm
would give a $\delta$-function at area $= \pi R^2 = 3.8$. One can see this being approached at large $\alpha$. For
$\alpha = 1.0$, 0.1 or 0, the area distribution is much broader. Thus, with Qanti-$k_T$, the jet area can be either
larger or smaller than what comes from using classical anti-$k_T$.

Figure 3: z is defined as the fraction of interpretations of an event satisfying a set of cuts. Shown
is the distribution of z for signal ($H + Z$ events, hollow, blue) and background ($Z + b\bar{b}$ events, solid,
red) for various $\alpha$. The cuts used to calculate z are 110 GeV < $m_{jj}$ < 140 GeV and $p_T > 25$ GeV for
each jet. Top-left shows the classical case, where an event either satisfies the cuts ($z = 1$) or it does
not ($z = 0$). Distributions are normalized to area 1. These normalized distributions are the functions $p(z)$
discussed in Sec. 5.



4 Cut-weights 



Once one generates N clusterings of each event using Qanti-$k_T$, the clusterings can be used to improve
the statistical significance in an analysis. In the context of a search, combining multiple interpretations
can be used to improve the $S/\delta B$ (the signal size divided by the characteristic background uncertainty)
compared to a standard jet algorithm. Alternatively, the uncertainty on a mass, cross section, or
branching ratio measurement from a given sample can be reduced. In this paper, we focus on improving
$S/\delta B$.

Suppose one decides on a set of cuts which optimally distinguish signal from background for a
particular classical analysis. For example, in searching for $H + Z$ events with $H \to b\bar{b}$ one might like
to cut on the invariant mass of the $b\bar{b}$ pair. Whatever the cuts are, classically an event either passes
those cuts or does not pass them. With Qanti-$k_T$, a fraction z of the interpretations of an event, which
we call the cut-weight, can pass the cuts.

Figure 4: z is the fraction of interpretations of an event which satisfy the cuts, as in Fig. 3. The
2D distribution of z as a function of the classical dijet mass $m_{jj}$ is shown for some values of $\alpha$ for
signal and background. Every event gives a value of $m_{jj}$ and a value of $0 \le z \le 1$. Thus integrating
over z reproduces the classical $m_{jj}$ distribution, as shown in the bottom right. In the classical limit
($\alpha \to \infty$), information from multiple interpretations is inaccessible.

To get a feel for what the cut-weight distributions look like, we show signal and background
distributions of z in Fig. 3. Here, signal is $H + Z$ events with $H \to b\bar{b}$ and background is $Z + b\bar{b}$
events. We demand that $p_T^Z > 120$ GeV for all events, imagining Z decays to neutrinos and this is
a missing transverse energy cut. The cuts by which z is determined are that the two hardest jets
should have $p_T > 25$ GeV and that the dijet invariant mass of the two hardest jets is in the window
110 GeV < $m_{jj}$ < 140 GeV.
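
As a concrete illustration of how a cut-weight could be computed from a set of interpretations, the sketch below applies the two cuts just described to each interpretation and returns the passing fraction. It assumes each interpretation is a list of jet 4-vectors `(E, px, py, pz)` in GeV; the helper names are hypothetical and not taken from any library.

```python
import math

def pt(jet):
    """Transverse momentum of a jet given as (E, px, py, pz)."""
    return math.hypot(jet[1], jet[2])

def dijet_mass(j1, j2):
    """Invariant mass of the two-jet system."""
    e = j1[0] + j2[0]
    px, py, pz = (j1[k] + j2[k] for k in (1, 2, 3))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def passes_cuts(jets, pt_min=25.0, m_lo=110.0, m_hi=140.0):
    """Both of the two hardest jets above pt_min and their mass inside the window."""
    if len(jets) < 2:
        return False
    j1, j2 = sorted(jets, key=pt, reverse=True)[:2]
    return pt(j2) > pt_min and m_lo < dijet_mass(j1, j2) < m_hi

def cut_weight(interpretations, **cuts):
    """Fraction z of the interpretations of one event that pass the cuts."""
    return sum(passes_cuts(jets, **cuts) for jets in interpretations) / len(interpretations)
```

A classical analysis corresponds to a single interpretation, for which `cut_weight` can only return 0 or 1.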

In the classical limit ($\alpha = \infty$), we see that $z = 0$ or $z = 1$ only. That is, an event either satisfies
the cuts or it does not. For smaller $\alpha$, say $\alpha = 1$, there is a substantial fraction of the events for
which only some of the clusterings satisfy the cuts. Note that more signal events pass the cuts than
background events. For $\alpha = 0$, where the clustering is random, no more than half of the interpretations
pass the cuts.

As another way to visualize the value added by cut-weights, we show in Fig. 4 how z changes
for events with a given classical dijet invariant mass. In a classical analysis, one can look at the
distribution of $m_{jj}$ for signal and background and put a cut to optimize significance. Such a cut
corresponds to a vertical band in these plots. Because the distribution of z is different for signal and
background events with the same value of $m_{jj}$, it will help to incorporate z into the analysis. Applying
a 2-dimensional cut on $m_{jj}$ and z provides around a 6% improvement (for $\alpha = 0.1$) in $S/\sqrt{B}$ over
a classical (vertical) cut on the same events. Combining $m_{jj}$ with z using both $\alpha = 1$ and $\alpha = 0.1$
using boosted decision trees gives around a 7% improvement. However, cutting on z is not ideal since
one is still throwing out events instead of weighting the less signal-like events less. We discuss next
how to compute the significance using weighted events.

5 Statistics 

The fraction z of interpretations passing a set of cuts provides a weight for each event based on how many
interpretations of that event resemble signal according to some measure. Thus it is natural to use 
these weights directly in the calculation of the significance. In this section we discuss how this can be 
done. The procedure we describe here was used in [10] and is discussed in more detail in [23]. 

If one knew what the signal and background distributions should look like exactly, the optimal 
significance would be achieved by using something like a likelihood test. In practice, we never know 
exactly what signal and background should look like. Thus using likelihood ratios can be prone to 
picking up on pathological regions of simulations. Moreover, it can be extremely challenging to calcu- 
late the systematic uncertainty on likelihood-based significance estimates. Cuts provide a compromise 
where the simulation does not have to be perfect and the systematic uncertainty can be estimated 
more reliably. Multiple interpretations through the Qjets approach provide a method for combining
some of the advantages of both the cut-based and likelihood-based approaches. By using the fraction 
of interpretations in a window as a cut, one knows explicitly what regions of phase space are contribut- 
ing (as in a cut-based approach). However, since events that are more signal-like contribute more, the 
significance of an excess will be greater for a given luminosity than using a cut-based approach alone. 

5.1 Significance 

To quantify the improvement from our procedure we adopt as a measure the excess number of events
measured, $S$, divided by the expected fluctuations in the background, $\delta B$. That is,

$$\text{significance} = \frac{S}{\delta B} = \frac{N_{\text{observed}} - N^{\text{bkg}}_{\text{expected}}}{\delta N^{\text{bkg}}_{\text{expected}}} \qquad (5.1)$$

For example, suppose we see S = 100 excess events in some channel which a Higgs boson could
contribute to. If the background was only expected to fluctuate by $\delta B = 20$ events, then the significance
is $S/\delta B = 5$, which conventionally characterizes a discovery. That is, in order to replicate the
observed number of events, the background without signal would have had to have fluctuated by 5 times
more than $\delta B$. To calculate the significance with data, one needs to know the mean and variance of
$N^{\text{bkg}}_{\text{expected}}$.

A key feature of Qjets is that events are not characterized as signal or background (e.g. by passing
some cuts or not passing them). Rather they are assigned a weight z between 0 and 1 based on how
many interpretations of the event are signal-like (according to some measure). Thus the measured
number of signal events S no longer has to be an integer. Moreover, the fluctuations in the background
are now the fluctuations in a non-integer number.

The practical procedure we propose is very simple: count the number of events passing a set of
cuts weighted by z. That is, define $N_{\text{observed}}$ as the sum over z for each event (rather than counting the
number of events with z = 1). In order to decide if this number is consistent with a background-only
hypothesis, one needs to know the expected fluctuations in this weighted number. We now describe
how the expected size of fluctuations can be easily computed. We first review how the expected value
and variance of B are computed in a classical analysis and then describe how cut-weights can improve
significance.

5.2 Classical cut-based significance 

Suppose we are looking for a particular signal (like a Higgs boson) in a classical analysis and we design 
a set of cuts to optimize the discovery potential. Once the cuts are set, we can focus on the background 
expectation and fluctuations, since these will determine the significance of an observed excess. Let us 
say with a given luminosity that we expect N background events of a particular type to be produced. 
Let us say a fraction $\epsilon$ of these background events are expected to pass a set of cuts. We call $\epsilon$ the
reconstruction efficiency. Thus, in the absence of signal, we expect $N\epsilon$ events. We would next
like to know what the expected variance is around this mean. There are two contributions to the 
fluctuations about the mean: from the inherent quantum mechanical Poisson process which produces 
the events in the first place, and from the fact that any individual event has some probability of 
satisfying our cuts. 

The production rate is governed by a Poisson distribution. If we expect N events, the probability
of producing n events instead is

$$P(n|N) = \frac{N^n e^{-N}}{n!} \qquad (5.2)$$

This Poisson distribution has mean $N$ and standard deviation $\sigma = \sqrt{N}$. The variance is $\sigma^2 = N$.

Now consider the reconstruction efficiency. Say our background events pass our cuts a fraction $\epsilon$
of the time. For example, for the samples shown in Fig. 3, we can see from the top-left panel (the
classical case) that $\epsilon_B = 0.12$ for background and $\epsilon_S = 0.55$ for signal. Suppose there is only signal.
If n signal events are produced, what is the probability of finding a events passing our cuts? It is not
hard to see that this probability is given by a weighted binomial distribution:

$$B(a|n,\epsilon) = \binom{n}{a} \epsilon^a (1-\epsilon)^{n-a} \qquad (5.3)$$

This distribution has mean $\epsilon n$ and standard deviation

$$\sigma_a = \sqrt{n\epsilon(1-\epsilon)} \qquad (5.4)$$

To describe the full process, where n events are observed from an expected N events and of that
n, a events are reconstructed correctly, we combine the two probability distributions and sum over the
intermediate variable n. For example, we can ask what is the probability of finding 5 events passing
our cuts when we expect 100 to be produced? We have to sum over the probability of reconstructing
5 events from every possible value of the number of observed events, which can range from 5 to $\infty$.
This can be expressed as:

$$P(a) = \sum_{n=a}^{\infty} \left[ P(n|N) \cdot B(a|n,\epsilon) \right] \qquad (5.5)$$

This distribution has mean $\epsilon N$, as expected, and variance $\sigma^2 = N\epsilon$. Thus the uncertainty in the
number of background events measured is

$$\delta B = \sqrt{N_B \epsilon_B} \qquad (5.6)$$

The significance is then $S/\delta B = N_S \epsilon_S / \sqrt{N_B \epsilon_B}$.
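
The statement that the distribution in Eq. (5.5) has mean and variance both equal to $N\epsilon$ can be checked numerically. The sketch below truncates the infinite sum at a hypothetical `n_max`, uses $\epsilon_B = 0.12$ and $\epsilon_S = 0.55$ as read off Fig. 3, and takes N = 100 purely as an illustrative event count; the function names are invented for this example.

```python
import math

def poisson_pmf(n, N):
    """Eq. (5.2), evaluated in log space to avoid overflow."""
    return math.exp(-N + n * math.log(N) - math.lgamma(n + 1))

def binomial_pmf(a, n, eps):
    """Eq. (5.3)."""
    return math.comb(n, a) * eps ** a * (1.0 - eps) ** (n - a)

def passing_pmf(a, N, eps, n_max=300):
    """Eq. (5.5) with the sum over n truncated at n_max (an assumption)."""
    return sum(poisson_pmf(n, N) * binomial_pmf(a, n, eps) for n in range(a, n_max))

def classical_significance(n_sig, eps_sig, n_bkg, eps_bkg):
    """S / delta B for a cut-based analysis, using Eq. (5.6)."""
    return n_sig * eps_sig / math.sqrt(n_bkg * eps_bkg)

if __name__ == "__main__":
    N, eps = 100.0, 0.12                      # illustrative N; eps_B from Fig. 3
    pmf = [passing_pmf(a, N, eps) for a in range(60)]
    mean = sum(a * p for a, p in enumerate(pmf))
    var = sum(a * a * p for a, p in enumerate(pmf)) - mean ** 2
    print(mean, var)                          # both should be close to N * eps
    print(classical_significance(100, 0.55, 100, 0.12))
```

Both moments agree with $N\epsilon$ up to the truncation error, which is what makes $\delta B = \sqrt{N_B \epsilon_B}$ the relevant fluctuation scale.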

In summary, the uncertainty associated with the number of events gets a contribution from the 
Poisson nature of the production process and another contribution from the uncertainty on whether 
an event will pass our cuts. When both uncertainties are combined the mean and variance of the
expected number are both $\epsilon N$.

5.3 Weighted cuts with Qjets 

A trivial observation which simplifies the uncertainty calculation for weighted events is that, since each
event is independent, the probability that a of n events will pass a set of cuts is completely determined
by the probability that one event will pass the cuts. This is true both for classical algorithms, which
produce weights $z = 0$ or 1, and for algorithms which combine multiple interpretations, like the pruned
Qjets algorithm used in [10] and Qanti-$k_T$ described here. We start by rewriting the classical case
calculation in terms of single event probabilities, then discuss how the calculation is modified for
weighted events.

The cut-weight z denotes how signal-like a single event is: $z = 1$ is very much signal (by some
measure) and $z = 0$ is very much background. We can then define a function $p(z)$ which gives the
probability that an event has cut-weight z. For the classical analysis, an event can only have $z = 1$
(signal) or $z = 0$ (background). Thus this probability function in the classical case is

$$p_{\text{class}}(z) = (1 - \epsilon)\,\delta(z) + \epsilon\,\delta(z - 1) \qquad (5.7)$$

which matches the classical anti-$k_T$ panel of Fig. 3.

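What is the probability for a single event to pass a set of cuts? We can compute this either using
Eq. (5.3) with $a = 0, 1$ or with Eq. (5.7) integrating over z. The two methods agree:

$$\langle z \rangle = \int_0^1 dz\, z\, p_{\text{class}}(z) = \epsilon = \sum_{a=0,1} a\, B(a|n=1,\epsilon) \qquad (5.8)$$

Similarly, we find

$$\langle z^2 \rangle = \int_0^1 dz\, z^2\, p_{\text{class}}(z) = \epsilon = \sum_{a=0,1} a^2\, B(a|n=1,\epsilon) \qquad (5.9)$$

Thus if we know that exactly one event is produced, we find

$$\sigma^2_{p,\text{class}} = \langle z^2 \rangle - \langle z \rangle^2 = \epsilon(1 - \epsilon) \qquad (5.10)$$

as in Eq. (5.4) with $n = 1$.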

To get the expected variance on the full distribution, we have to include the Poisson uncertainty
which depends on the mean $\langle z \rangle = \epsilon$. By the central limit theorem, since the events are uncorrelated,
the characteristic uncertainty on the distribution where N events are expected is

$$\sigma_{\text{class}} = \sqrt{N\left(\sigma^2_{p,\text{class}} + \langle z \rangle^2\right)} = \sqrt{N\epsilon} \qquad (5.11)$$

in agreement with Eq. (5.6).
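
The combination of the single-event variance with the Poisson term can be made explicit. The following is a standard result for a sum of a Poisson-distributed number of independent, identically distributed weights (law of total variance), written here with $\sigma_z^2 = \langle z^2\rangle - \langle z\rangle^2$ denoting the single-event variance; it is offered as a supporting step rather than as the derivation used in [23].

```latex
\begin{align}
A &= \sum_{i=1}^{n} z_i, \qquad n \sim \mathrm{Poisson}(N), \\
\mathrm{E}[A] &= \mathrm{E}[n]\,\langle z\rangle = N\langle z\rangle, \\
\mathrm{Var}[A] &= \mathrm{E}[n]\,\sigma_z^2 + \mathrm{Var}[n]\,\langle z\rangle^2
  = N\left(\sigma_z^2 + \langle z\rangle^2\right) = N\langle z^2\rangle .
\end{align}
```

For weights restricted to 0 or 1 one has $\langle z^2\rangle = \langle z\rangle = \epsilon$, so the variance reduces to $N\epsilon$.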

With cut-weights $0 \le z \le 1$, we would also like to know what the probability is that a events pass
our cuts if n events were produced at the collider which was expected to produce N events. The new
feature in the Qanti-$k_T$ case is that a and z are not necessarily integers. With Qanti-$k_T$, each event
is interpreted multiple times. For each event, a fraction z of the interpretations pass the cuts and a is
the sum of the z values over all the measured events. In the Qanti-$k_T$ case the function $p(z)$ now has
meaning for $0 \le z \le 1$. Examples of $p(z)$ for various $\alpha$ are shown in Fig. 3.

Although the different interpretations coming from Qjets for the same event are highly correlated,
each event is uncorrelated with any other. Thus, as with the classical case, the probability of finding a
events satisfying the cuts when n events are produced is completely determined by the probability that
one event will satisfy the cuts. That is, we do not need to know what the generalization of $B(a|n,\epsilon)$
in Eq. (5.3) is, only that it is determined completely by $p(z)$.

We calculate the uncertainty with weighted events exactly as we did in the classical case, with $p(z)$
replacing $p_{\rm class}(z)$. That is, we calculate

$$\langle z \rangle_p = \int dz\, z\, p(z) \tag{5.12}$$

and

$$\sigma_p^2 = \int dz\, z^2\, p(z) - \langle z \rangle_p^2 \tag{5.13}$$

Then if $N$ events should have been produced, the expected number to be observed is

$$N_{\rm expected} = N \langle z \rangle_p \tag{5.14}$$

The uncertainty on this number is

$$\delta_{\rm Qjets} = \sqrt{N\left(\sigma_p^2 + \langle z \rangle_p^2\right)} = \sqrt{N}\sqrt{\langle z^2 \rangle_p} \tag{5.15}$$

which is just Poisson fluctuations multiplying the root-mean-square (RMS) of the distribution.
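The expected yield and its uncertainty can be estimated from a sample of cut-weights. The sketch below replaces the integrals over $p(z)$ in Eqs. (5.12) and (5.13) with sample averages, which is an assumption about how one would estimate them in practice; the weights and event count are made up.

```python
from math import sqrt

def weighted_yield(z_values: list[float], n_produced: float) -> tuple[float, float]:
    """Return (expected count, uncertainty), estimating moments of p(z) by sample averages."""
    n = len(z_values)
    mean = sum(z_values) / n
    mean_sq = sum(z * z for z in z_values) / n
    variance = mean_sq - mean**2
    expected = n_produced * mean
    uncertainty = sqrt(n_produced * (variance + mean**2))  # equals sqrt(N) * RMS of z
    return expected, uncertainty

# Hypothetical cut-weights and number of produced events.
expected, uncertainty = weighted_yield([1.0, 0.5, 0.0, 0.5], n_produced=100)

# With classical 0/1 weights the same function gives sqrt(eps * N).
classical_expected, classical_uncertainty = weighted_yield([1.0, 0.0, 0.0, 0.0], n_produced=100)
```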

In summary, to use weighted events, instead of counting an event as either satisfying a set of
cuts ($z = 1$) or not satisfying them ($z = 0$), an event can fractionally satisfy them, giving a weight
$0 \le z \le 1$. Then the number of observed events is the sum of these $z$ values over all events. For
a signal process, this number is written as $S = N_S \langle z \rangle_{p_S}$, where $p_S(z)$ is given by the cross section for
getting a $z$ value in a signal process, normalized to unit area, and $N_S$ is the total number of signal events
considered. For background, $B = N_B \langle z \rangle_{p_B}$. The characteristic size of fluctuations of $B$ is given, in the
limit of a large number of events where the central limit theorem can be applied, by
$\delta B = \sqrt{N_B \langle z^2 \rangle_{p_B}}$, so that

$$\text{significance} = \frac{S}{\delta B} = \frac{N_S \langle z \rangle_{p_S}}{\sqrt{N_B \langle z^2 \rangle_{p_B}}} \tag{5.16}$$

To see how much cut-weights can help, one can take the ratio of this value to the cut-based significance.
The overall numbers of signal and background events considered, $N_S$ and $N_B$, conveniently drop out
of such a ratio.
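The cancellation of $N_S$ and $N_B$ in the ratio can be seen in a short sketch. It evaluates Eq. (5.16) from samples of cut-weights (sample averages standing in for the moments of $p_S$ and $p_B$) and treats the cut-based analysis as the special case where every weight is 0 or 1. All weights below are invented for illustration.

```python
from math import sqrt

def significance(z_sig, z_bkg, n_sig, n_bkg):
    """S / deltaB from Eq. (5.16), with moments estimated as sample averages."""
    mean_z_sig = sum(z_sig) / len(z_sig)
    mean_z2_bkg = sum(z * z for z in z_bkg) / len(z_bkg)
    return n_sig * mean_z_sig / sqrt(n_bkg * mean_z2_bkg)

def gain_over_cuts(zw_sig, zw_bkg, zc_sig, zc_bkg, n_sig, n_bkg):
    """Ratio of cut-weighted to cut-based significance for the same event counts."""
    weighted = significance(zw_sig, zw_bkg, n_sig, n_bkg)
    cut_based = significance(zc_sig, zc_bkg, n_sig, n_bkg)
    return weighted / cut_based

# Hypothetical weights: classical (0/1) and Qjets-style (fractional) for the same events.
zc_sig, zc_bkg = [1, 1, 0, 0], [1, 0, 0, 0]
zw_sig, zw_bkg = [0.9, 0.8, 0.2, 0.1], [0.4, 0.3, 0.2, 0.1]

gain_unit = gain_over_cuts(zw_sig, zw_bkg, zc_sig, zc_bkg, n_sig=1.0, n_bkg=1.0)
gain_real = gain_over_cuts(zw_sig, zw_bkg, zc_sig, zc_bkg, n_sig=50.0, n_bkg=4000.0)
```

The two gains agree, so the comparison between methods can be made without knowing the cross sections or the luminosity.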

5.4 Reweighting 

The procedure we have described can be applied for any way of computing weights. Using multiple 
interpretations to generate the weight z is natural and intuitive. As a simple generalization, one can 
consider transforming the weight by any function $t(z)$ to see if significance can be improved. The
optimal function will be the one that produces an extremum of the functional 

$$\text{significance}[t] = \frac{S}{\delta B} = \frac{N_S}{\sqrt{N_B}}\, \frac{\int_0^1 dz\, t(z)\, p_S(z)}{\sqrt{\int_0^1 dz\, [t(z)]^2\, p_B(z)}} \tag{5.17}$$

observables       cuts    cut-weighted    cut-reweighted
$m_{jj}$          1.00    -               -
$\alpha = 0$      1.00    0.79            0.82
$\alpha = 0.1$    1.01    1.19            1.24
$\alpha = 0.3$    1.00    1.22            1.28
$\alpha = 1.0$    1.02    1.18            1.24

Table 1: Comparison of the significance using the cut-based, cut-weighted, and reweighted methods.
The $m_{jj}$ window used 110 GeV < $m_{jj}$ < 140 GeV taken from and the significance of this cut is
normalized to 1. The mass window has not been optimized (optimizing it on our samples leads to
104 GeV < $m_{jj}$ < 136 GeV and gives a significance of 1.03). This same 110 GeV < $m_{jj}$ < 140 GeV
window is used to compute the weight functions $p(z)$ for signal and background. "Cuts" refers to the
number $N_S/\sqrt{N_B}$ of events in a window, cutting on the $p_S(z)$ and $p_B(z)$ distributions as well as
$m_{jj}$ in the Qanti-kT cases. "Cut-weighted" refers to using Eq. (5.16) and "cut-reweighted" refers to
using Eqs. (5.17) and (5.19). All numbers are for the same $Z + H$ sample (signal) and $Z + b\bar{b}$ sample
(background), as described in Sec. 6.3.



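The functional variation of the significance is

$$\frac{\delta\, \text{significance}[t]}{\delta t(z')} = \frac{N_S}{\sqrt{N_B}} \left[ \frac{p_S(z')}{\langle t^2 \rangle_{p_B}^{1/2}} - \frac{t(z')\, p_B(z')\, \langle t \rangle_{p_S}}{\langle t^2 \rangle_{p_B}^{3/2}} \right] \tag{5.18}$$

This vanishes when

$$t(z) = \frac{p_S(z)}{p_B(z)} \tag{5.19}$$

up to an overall constant which has no effect on the significance enhancement.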
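The optimum $t(z) = p_S(z)/p_B(z)$ can be checked on a binned toy example. The sketch below discretizes the functional of Eq. (5.17) with the $N_S/\sqrt{N_B}$ prefactor dropped; the bin centers and per-bin probabilities are invented, and every background bin is assumed to be non-empty so that the ratio is defined.

```python
from math import sqrt

def significance_functional(t, p_sig, p_bkg):
    """Binned version of Eq. (5.17) without the N_S / sqrt(N_B) prefactor."""
    numerator = sum(ti * ps for ti, ps in zip(t, p_sig))
    denominator = sqrt(sum(ti * ti * pb for ti, pb in zip(t, p_bkg)))
    return numerator / denominator

# Hypothetical bin centers in z and per-bin probabilities (each list sums to 1).
z_bins = [0.1, 0.3, 0.5, 0.7, 0.9]
p_sig = [0.05, 0.10, 0.15, 0.30, 0.40]
p_bkg = [0.40, 0.30, 0.15, 0.10, 0.05]

t_identity = z_bins                                   # plain cut-weights, t(z) = z
t_optimal = [ps / pb for ps, pb in zip(p_sig, p_bkg)]  # Eq. (5.19)

f_identity = significance_functional(t_identity, p_sig, p_bkg)
f_optimal = significance_functional(t_optimal, p_sig, p_bkg)
f_rescaled = significance_functional([3.0 * t for t in t_optimal], p_sig, p_bkg)
```

By the Cauchy-Schwarz inequality the binned functional is bounded by $\sqrt{\sum_i p_{S,i}^2/p_{B,i}}$, which is the value reached by `t_optimal`; rescaling $t$ by a constant leaves the result unchanged.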

To use these results in practice, suppose we are interested in how much luminosity it would take
to see a certain signal over a certain background. We first compute the expected numbers $N_S$ and
$N_B$ of signal and background events produced at the collider for a given set of cuts. Given these cuts,
we can calculate $p_S(z)$ and $p_B(z)$, as in Fig. 3. These functions give us $\langle z \rangle_{p_S}$ and
$\langle z^2 \rangle_{p_B}$ (as well as $\langle t \rangle_{p_S}$ and $\langle t^2 \rangle_{p_B}$ if we want to use reweighted events).
We then calculate $S = N_S \langle z \rangle_{p_S}$ and $\delta B = \sqrt{N_B \langle z^2 \rangle_{p_B}}$.
The expected significance is given by $S/\delta B$. With data, one could just look
for an excess over expected background. Then $S$ would be replaced by $N_{\rm observed} - N^{\rm bkg}_{\rm expected}$.

As a comparison of the cut-based, cut-weighted, and reweighted approaches, we give the expected
significance for each method in Table 1 for the $Z + H$ signal and $Z + b\bar{b}$ background samples. Note
that since $S/\delta B$ scales as $\sqrt{\mathcal{L}}$ (the square root of the luminosity), an improvement in $S/\delta B$ of 28%
means that one can make measurements with a significance comparable to standard anti-kT using only
$(1/1.28)^2 = 61\%$ of the luminosity. On the other hand, since $S/\delta B$ is proportional to $N_S/\sqrt{N_B}$ for any
$p$, one can compare the significance for different algorithms and cuts independent of the expected cross
section and luminosity.
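The luminosity arithmetic follows directly from the $\sqrt{\mathcal{L}}$ scaling and can be written as a one-line helper. The function name is illustrative.

```python
def luminosity_fraction(improvement: float) -> float:
    """Fraction of the luminosity needed to match the baseline significance,
    given a fractional gain in S/deltaB and significance scaling as sqrt(luminosity)."""
    return 1.0 / (1.0 + improvement) ** 2

fraction_higgs = luminosity_fraction(0.28)  # the 28% gain quoted for Z + H; about 0.61
```

The same helper can be applied to any of the gains quoted in the tables of this paper.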



6 Example applications 

In this section we show how Qanti-kT can be useful for a variety of searches. We will consider
three signals, listed here in ascending order of complexity: (1) a resonance decaying into dijets, (2)
a resonance produced in association with a vector boson (including the $H + Z$ example), and (3)
pair production of two resonances. We will see that while Qanti-kT does little to improve ordinary
dijet reconstruction, the significance of more complex events can be improved by almost 50% over a
classical analysis.

Figure 5: A comparison of the signal (left) and background (right) dijet invariant mass distributions
using standard anti-kT and Qanti-kT for optimized parameters. Signal is $Z\phi \to \nu\bar{\nu}gg$ with
$m_\phi = 500$ GeV and background is $Zgg \to \nu\bar{\nu}gg$. All events have $\not\!\!E_T > 800$ GeV and
$p_T > 400$ GeV for each jet.

For each event class, we first process the signal and background events with classical anti-kT at
various values of $R$. We then fix the value of $R$ which optimizes the $S/\delta B$ ratio (which for
classical anti-kT is simply $N_S \epsilon_S/\sqrt{N_B \epsilon_B}$). We then use this value of $R$ in Qanti-kT and compute
$S/\delta B$ for different values of the rigidity ($\alpha$). Qanti-kT is useful to the extent that $S/\delta B$ is larger than
$S/\delta B$ for the classical analysis. How $S/\delta B$ is computed in Qanti-kT was discussed in the previous
section. Results are summarized in Table 2.
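The two-stage procedure described above (fix $R$ using the classical analysis, then scan the rigidity at that $R$) can be sketched as follows. The callables `classical_sig` and `qjets_sig` are hypothetical stand-ins for a full analysis chain; the toy gains are the $pp \to h + Z$ entries of Table 2 and the toy dependence on $R$ is invented.

```python
def scan_rigidity(classical_sig, qjets_sig, r_values, alpha_values):
    """Pick the R that maximizes the classical S/deltaB, then scan alpha at that R."""
    best_r = max(r_values, key=classical_sig)
    baseline = classical_sig(best_r)
    gains = {alpha: qjets_sig(best_r, alpha) / baseline for alpha in alpha_values}
    best_alpha = max(gains, key=gains.get)
    return best_r, best_alpha, gains

def toy_classical(r):
    """Invented classical significance, peaked at R = 0.7."""
    return 1.0 - (r - 0.7) ** 2

# Relative gains taken from the pp -> h + Z row of Table 2.
TOY_GAIN = {0.0: 0.79, 0.01: 0.99, 0.1: 1.19, 1.0: 1.18, 100.0: 1.01}

def toy_qjets(r, alpha):
    """Invented Qanti-kT significance: classical value scaled by the tabulated gain."""
    return toy_classical(r) * TOY_GAIN[alpha]

best_r, best_alpha, gains = scan_rigidity(
    toy_classical, toy_qjets, r_values=[0.5, 0.7, 0.9], alpha_values=list(TOY_GAIN)
)
```

Note that $R$ is not re-optimized for Qanti-kT in this procedure, so the quoted gains are, if anything, conservative with respect to a joint scan.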

6.1 Simple resonance reconstruction 

We consider first a dijet resonance decaying to gluons. The signal process is $pp \to \phi \to gg$ with
$m_\phi = 1$ TeV. The background is dijet production in the standard model. We consider the two hardest jets
in each event, requiring both jets to satisfy $p_T(j) > 425$ GeV and the dijet mass to be in the window
950 GeV < $m$ < 1050 GeV. For this process and these cuts, we find that $R = 1.1$ gives the best $S/\delta B$
ratio using classical anti-kT.

Running Qanti-kT on these samples, we find at most a 3% improvement (see Table 2). That we
find only a small improvement is perhaps not unexpected in this case. With hard, well-separated dijets,
any algorithm and most jet sizes should be able to pick out the dijets and get their invariant mass
mostly correct. Since there is little ambiguity in the events' interpretation, there is little to gain from
resampling with Qanti-kT.

6.2 Boosted resonances in associated production 

Next we consider the case where a neutral scalar is produced in association with a $Z$ boson. Unlike
in the pure dijet case considered above, when the $Z$ boson and resonance have significant transverse
momentum, the jets from the resonance decay will be closer together and of unequal $p_T$. Thus,
there will be more ambiguity about whether or not the jets pass the $p_T$ cut. For systems with a larger
boost there will be an additional ambiguity due to overlap between the jets.

First we consider the process $pp \to Z\phi \to gg\nu\bar{\nu}$ where $m_\phi = 500$ GeV. The background is
$pp \to Z$ + dijets. We require $\not\!\!E_T > 400$ GeV, that the two hardest jets satisfy $p_T(j) > 200$ GeV, and
that the dijet invariant mass fall within the window 450 GeV < $m$ < 550 GeV. Here we find that
the classical value of $R$ that optimizes $S/\delta B$ is 0.95. Running Qanti-kT on this sample, we find a 9%
improvement in $S/\delta B$ at $\alpha = 1.0$.

That the improvement is larger in this case than without the boost is consistent with the intuition
that Qanti-kT helps more when the interpretation of an event is more ambiguous. For boosted
resonances, the jet boundaries are close together. A classical algorithm, which takes only one
interpretation of the event, could easily assign radiation to the wrong jet. With 100 different interpretations
of an event, some fraction of those interpretations will reconstruct the two jets more correctly than
the classical algorithm.

Considering the same 500 GeV scalar but going to higher boost, Qanti-kT helps even more. We
next require $\not\!\!E_T > 800$ GeV and $p_T > 400$ GeV for each of the jets. This selects the events where the
jets are even closer together. Here the optimal $R$ value is found to be 0.65. Using this value of $R$, we
find that with $\alpha = 0.1$, Qanti-kT produces an $S/\delta B$ 19% larger than in the classical case.

We show the distribution of $m_{jj}$ for signal and background for the classical and Qanti-kT samples
in Fig. 5. In the classical case, each event contributes a single value of $m_{jj}$. For Qanti-kT, each event
contributes many (100 in our samples) values of $m_{jj}$. Although the Qanti-kT mass peak is broader
for signal (so that $S$ goes down), the improvement in the background stability (so that $\delta B$ goes down)
provides sufficient compensation so that $S/\delta B$ goes up overall.

6.3 Higgs + Z

The boosted resonance analysis can be applied to Higgs boson production. Although boosted Higgs
production can be considered with jet substructure methods [24], these methods require the boost
to be so large that the Higgs decay products merge into a single fat jet. For Qanti-kT, the boost
does not have to be so extreme. In fact, unlike substructure techniques, Qanti-kT will never degrade
significance (although it sometimes will not help much) as long as $\alpha$ is optimized, since $\alpha = \infty$ reduces
to the classical case.

We consider $H + Z$ production where the $Z$ decays to neutrinos and the Higgs decays to a $b$-quark
pair ($ZH \to \nu\bar{\nu}b\bar{b}$). As background, we take $Z + b\bar{b}$ production without a Higgs. We require that
events yield at least two jets with $p_T > 25$ GeV, $\not\!\!E_T > 120$ GeV, and that the invariant mass of
the hardest two jets fall within the window 110 GeV < $m$ < 140 GeV. The optimal $R$ value for the
classical analysis in this case is 0.7. Taking $R = 0.7$, we find that the optimal rigidity is $\alpha = 0.3$, for which
$S/\delta B$ improves by 22% using the cut-weighted approach and by 28% if we reweight by $p_S/p_B$ as
discussed in Sec. 5.4 (see also Table 1).

A gain of 28% is a substantial improvement in significance for an $H \to b\bar{b}$ channel. Indeed, a classical
multivariate approach involving a sinkful of kinematic and substructure variables [14] was only able to
achieve improvements in significance of order 20%. Moreover, the $p_T$ cut of 120 GeV (which can easily
be lowered) is not as extreme as the 200 GeV cut proposed in [24], so more signal events can enter
the Qanti-kT analysis than the boosted one. This at least suggests that the multiple-interpretations
approach warrants more detailed study for Higgs searches.

6.4 Resonance pair reconstruction 

Next, we consider four-jet events to test how well Qanti-kT works in a more complex jet environment.
We consider the process $pp \to \phi\phi$, $\phi \to gg$, where $\phi$ is again a neutral scalar with $m_\phi = 500$ GeV
(see Refs. [25, 26] for similar analyses at ATLAS and CMS). The background in this case is four-jet
production in QCD.

Figure 6: The two resonance masses in the $pp \to \phi\phi$ process found for 100 interpretations of a single
signal event using Qjets. From top left going clockwise, $\alpha = 0.1$, $\alpha = 1$, $\alpha = 10$, and $\alpha = 100$. We
see that while the $\alpha \to \infty$ interpretation of the event does not fall within the mass window, such an
interpretation arises when $\alpha$ is relaxed to $\sim 1$ and below.

In analyzing the four-jet events, while the core Qanti-kT algorithm remains unaltered, we add a
preselection step to speed up the analysis (cf. Sec. 7). In the preselection step we run both signal and
background events through anti-kT using FastJet with $R = 0.5$. We then check whether each of the
four hardest jets in each event has $p_T > 120$ GeV. Only events passing this cut are passed through
to our non-deterministic anti-kT algorithm.

Sample                  R       anti-kT    Improvement in $S/\delta B$ (ratio to anti-kT)
                                           $\alpha = 0$   $\alpha = 0.01$   $\alpha = 0.1$   $\alpha = 1$   $\alpha = 100$
$pp \to \phi$           1.10    1.0        0.14           0.77              0.89             1.03*          1.01
$pp \to \phi + Z$ (A)   0.95    1.0        0.64           0.99              1.07             1.09*          1.01
$pp \to \phi + Z$ (B)   0.65    1.0        0.58           0.98              1.19*            1.10           1.01
$pp \to h + Z$          0.7     1.0        0.79           0.99              1.19*            1.18           1.01
$pp \to \phi + \phi$    0.75    1.0        0.75           1.43              1.49*            1.40           1.01

Table 2: The improvement in $S/\delta B$ compared to standard anti-kT for various processes using different
values of $\alpha$, the rigidity parameter. $pp \to \phi + Z$ (A/B) denote the $\phi + Z$ processes with missing energy
cuts of 400 and 800 GeV, respectively. The value of $R$ used in both standard anti-kT and Qanti-kT is
the one which optimizes the standard anti-kT results. The largest improvements are marked with an asterisk.

Each interpretation of each event using Qanti-kT (or the single classical interpretation) gives a set
of jets. Our goal is to select from these jets the four that yield two pairs which are close to each other
in mass. In order to do this, we select the five hardest jets from the final set of jets, form all possible
pairs, and calculate the invariant mass for each pair. For each two pairs $a$ and $b$ (representing the
reconstructed scalars), we calculate the quantity $|m_a - m_b|/(m_a + m_b)$ to evaluate how close in mass
the reconstructed scalars are. We choose the pairing that minimizes the mass difference between the
two reconstructed scalars. Once the pairing is chosen, we further require that:

• The mass difference between the two reconstructed scalars is less than 20%:
$|m_a - m_b|/(m_a + m_b) < 0.2$.

• The average mass $(m_a + m_b)/2$ of the two reconstructed scalars falls within the window 450 -
550 GeV.

• Each jet used to reconstruct the scalars must have $p_T > 120$ GeV.
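The pairing and selection steps above can be sketched for a single interpretation of an event. This is an illustrative implementation under stated assumptions, not the code used for the paper: jets are plain `(E, px, py, pz)` tuples in GeV, the function names are hypothetical, and ties between pairings are broken by whichever is found first.

```python
from itertools import combinations
from math import hypot, sqrt

def pt(jet):
    """Transverse momentum of a jet given as (E, px, py, pz)."""
    return hypot(jet[1], jet[2])

def pair_mass(j1, j2):
    """Invariant mass of the sum of two four-momenta."""
    e, px, py, pz = (j1[k] + j2[k] for k in range(4))
    return sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def best_pairing(jets, n_hardest=5):
    """Among the hardest jets, find two disjoint pairs minimizing |ma - mb| / (ma + mb)."""
    hardest = sorted(jets, key=pt, reverse=True)[:n_hardest]
    best = None
    for four in combinations(hardest, 4):
        first, rest = four[0], four[1:]
        for i, partner in enumerate(rest):  # three ways to split four jets into two pairs
            others = rest[:i] + rest[i + 1:]
            m_a, m_b = pair_mass(first, partner), pair_mass(*others)
            if m_a + m_b == 0.0:
                continue
            asym = abs(m_a - m_b) / (m_a + m_b)
            if best is None or asym < best[0]:
                best = (asym, m_a, m_b, four)
    return best

def passes_selection(best, window=(450.0, 550.0), max_asym=0.2, min_pt=120.0):
    """Apply the three requirements listed above to the chosen pairing."""
    if best is None:
        return False
    asym, m_a, m_b, used = best
    return (asym < max_asym
            and window[0] < 0.5 * (m_a + m_b) < window[1]
            and all(pt(j) > min_pt for j in used))

# Hypothetical event: two 500 GeV resonances decaying to massless jets, plus a soft jet.
jets = [(250.0, 250.0, 0.0, 0.0), (250.0, -250.0, 0.0, 0.0),
        (400.0, 0.0, 400.0, 0.0), (156.25, 0.0, -156.25, 0.0),
        (30.0, 0.0, 0.0, 30.0)]
best = best_pairing(jets)
accepted = passes_selection(best)
```

Running this over every interpretation of an event and averaging `accepted` would give the event's cut-weight $z$.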

An example distribution of $m_a$ vs $m_b$ for a single event is shown in Fig. 6. We see that the
classical analysis ($\alpha = 100$) does not find $m_a \sim m_b = 500$ GeV, which would correspond to perfect
reconstruction. The distribution of $m_a$ and $m_b$ for finite $\alpha$ shows that many masses can be sampled.
More importantly, we see that some samplings come very close to the perfect reconstruction. This
shows why Qanti-kT will be helpful for this multijet sample.

This procedure is applied first to the classical analysis. We find that $R = 0.75$ maximizes $S/\delta B$
in the classical case. Using this value of $R$, the $S/\delta B$ improvements using Qanti-kT on the same
signal and background events at different values of the rigidity parameter $\alpha$ are shown in Table 2.
We see that at $\alpha = 0.1$ there is a 49% improvement in $S/\delta B$ over the classical results. As with the
previous cases, when $\alpha$ approaches higher values such as 10 and 100, the improvement declines as the
algorithm begins to behave more like the classical algorithm. At very low values of $\alpha$ the performance
of Qanti-kT is poor. Again this is expected, since at low $\alpha$ the mergings are highly random and have
little physical motivation.

The large improvement (49%) in significance achievable with Qanti-kT over the classical analysis
is consistent with our expectation that Qanti-kT helps more in more complicated event topologies. In
this case, having four jets rather than two makes the jets more likely to overlap and Qanti-kT is more
likely to be helpful.

7 Speed 

Unfortunately, adding non-determinism to a jet algorithm and running it 100 times can slow down
an analysis significantly. One might expect that running something 100 times (with no optimization)
should take at worst 100 times as long as running it once. In fact, our algorithm
can be even slower. The reason is that one must recompute $\omega_{ij}$ at each stage in the clustering using a
new $d^{\min}$ (see Eq. 2.3), whereas ordinary anti-kT need only compute the smallest distance at each step.
Because of this extra information required, we cannot exploit without modification the computational
geometry techniques [27] which make FastJet fast. The result is that it can take tens of seconds per
CPU to run 100 iterations on an event with several hundred particles. This is more of an inconvenience
than a problem at the current time. Nevertheless, it would be nice to speed Qanti-kT up.
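To make the extra cost concrete, the sketch below shows one non-deterministic merging step. It assumes the weight of Eq. (2.3), which is not reproduced in this section, has the exponential form used in the Qjets literature, $\omega_{ij} = \exp[-\alpha (d_{ij} - d^{\min})/d^{\min}]$; the distances, function names and random seed are illustrative only.

```python
import math
import random

def merge_weights(distances: dict, alpha: float) -> dict:
    """Weight for every candidate pair (assumed form of Eq. 2.3); needs all distances and d_min > 0."""
    d_min = min(distances.values())
    return {pair: math.exp(-alpha * (d - d_min) / d_min) for pair, d in distances.items()}

def choose_merging(distances: dict, alpha: float, rng: random.Random):
    """Pick the pair to merge with probability proportional to its weight."""
    weights = merge_weights(distances, alpha)
    pairs = list(weights)
    return rng.choices(pairs, weights=[weights[p] for p in pairs], k=1)[0]

def classical_merging(distances: dict):
    """Ordinary sequential recombination only needs the pair with the smallest distance."""
    return min(distances, key=distances.get)

# Hypothetical pairwise distances between three pseudojets.
distances = {(0, 1): 1.0, (0, 2): 2.0, (1, 2): 4.0}
rng = random.Random(0)
flat = merge_weights(distances, alpha=0.0)            # every weight equals 1: random clustering
rigid = choose_merging(distances, alpha=100.0, rng=rng)  # effectively always the closest pair
```

The classical step is a single minimum search, while the non-deterministic step touches every pair after each merging, which is why the usual nearest-neighbor shortcuts do not apply directly.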

While our unoptimized implementation (available at http://jets.physics.harvard.edu/Qantikt)
is fast enough for practical use, there are a few methods one can employ to speed it up. These include:

• Preselection: To avoid unnecessary computation it can be helpful to first require that all events
pass a loose set of cuts using classical anti-kT jets before running anti-kT non-deterministically.
This can significantly reduce the number of events processed. For instance, if one is interested in
a computation of dijet invariant mass for all events with jets satisfying $p_T > 100$ GeV, one might first
apply a preselection cut requiring all events to have classical anti-kT jets which satisfy $p_T > 75$ GeV.

• Limited mergings: Rather than computing the distance between each pair of four-momenta 
one can make the physically motivated assumption that a pair of particles further apart than 
some large distance (say, $\Delta R > 2.0$) are unlikely to be part of the same jet. Such pairings can
be excluded from the analysis to improve the algorithm execution time. 

• Preclustering: The runtime scales as $n^2 \ln n$ for $n$ the number of particles to cluster, so its
performance is quite sensitive to the number of initial particles. An easy way to reduce the
number of particles used as input to the algorithm is to first cluster them into larger microjets
or into a coarser grid. For instance, if one finds that cells of $\delta\phi \times \delta\eta = 0.1 \times 0.1$ yield an algorithm
which is too slow, one can merge these into $\delta\phi \times \delta\eta = 0.2 \times 0.2$ cells to realize an $\mathcal{O}(16\times)$ speedup.

• Optimization: Since much of the distance information is reused from iteration to iteration, 
there is plenty of potential to speed up the analysis by not recomputing these distances each 
iteration. More generally, smarter programming should speed up the algorithm significantly, as 
in FastJet [27].

• Modification: Our non-deterministic anti-kT algorithm is in a sense the simplest way to apply
Qjets at the event level. One can easily conceive of other methods which might be better suited
to speed-up. A promising approach which just clusters once and then varies the jet size $R$ is
discussed in [28].
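The preclustering estimate in the list above can be illustrated with a small sketch. It merges particles into a coarser $(\eta, \phi)$ grid and evaluates the speedup implied by the leading $n^2$ scaling, ignoring the logarithm. Summing scalar $p_T$ per cell and placing the merged cell at its geometric center are simplifications assumed here, not the paper's prescription; the input grid is invented.

```python
import math
from collections import defaultdict

def precluster(particles, cell_size):
    """Merge (pt, eta, phi) particles into square cells of the given size."""
    cells = defaultdict(float)
    for pt, eta, phi in particles:
        key = (math.floor(eta / cell_size), math.floor(phi / cell_size))
        cells[key] += pt  # simplification: scalar pT sum instead of a four-vector sum
    return [(pt, (i + 0.5) * cell_size, (j + 0.5) * cell_size)
            for (i, j), pt in cells.items()]

def quadratic_speedup(n_before: int, n_after: int) -> float:
    """Speedup expected from n^2 scaling alone (the ln n factor is neglected)."""
    return (n_before / n_after) ** 2

# Hypothetical fully occupied 0.1 x 0.1 grid covering a 0.4 x 0.4 patch.
particles = [(1.0, 0.05 + 0.1 * i, 0.05 + 0.1 * j) for i in range(4) for j in range(4)]
coarse = precluster(particles, cell_size=0.2)
speedup = quadratic_speedup(len(particles), len(coarse))
```

Doubling the cell size in both directions cuts the number of inputs by four, which the $n^2$ scaling turns into the factor of 16 quoted above.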

8 Conclusion 

In this paper, we have presented a fundamentally new way to think about events with jets. Traditional 
algorithms, such as anti-kT, give a single interpretation of an event. This interpretation can be thought
of as a best guess at the assignment of particles into jets. These jets are meant to represent which 
particles came from the showering and fragmentation of which hard particle. In many events, however, 
there can be significant ambiguities in which particle belongs to which jet. These ambiguities show up, 
for example, in how different jet algorithms or jet sizes can give vastly different results for infrared safe 
observables. The problem is that each algorithm gives a single best guess no matter what - ambiguous 
events and unambiguous events are treated the same way, and all information about the ambiguity is 
lost. In other words, an event which is clearly signal-like by some measure is given the same influence 
over the results as an event which is marginally signal-like (in the sense that it would no longer be 
signal-like under a small change of parameters).

The idea behind Qjets, which we have used here on the event level, is that the ambiguity
provides useful information about an event. By making a jet algorithm non-deterministic, we can
compute the distribution of interpretations around the classical interpretation via Monte Carlo sampling.
When a non-deterministic jet algorithm (for example the Qanti-kT algorithm we present here)
is run 100 times on an event, 100 different interpretations result. The larger the variation
in these interpretations, the more ambiguous an event is. We introduce a parameter $\alpha$, called rigidity
after a similar parameter in [10], which interpolates between classical anti-kT ($\alpha = \infty$) and purely
random clustering ($\alpha = 0$).

There are many ways that an ensemble of interpretations can be used. The simplest way is to 
construct a Q-observable such as the variance of some classical observable (like the jet mass) over the 
interpretations. An example of this approach is the volatility variable introduced in [10]. One can 
then cut on this variance to improve significance. However, since almost all events are signal-like to 
some extent, it makes more sense to include all the events in the analysis, with a weight based on the 
fraction of interpretations passing a set of cuts. We derive a formula for the significance using weighted
events which can be used to incorporate information from all the interpretations of all the events,
rather than cutting some events out altogether.

We applied Qanti-kT to a number of types of events. We find that unambiguous processes, like
those which produce hard and well-separated jets, do not benefit much from this procedure. However,
for more complicated processes, such as those with softer or overlapping jets, the significance can be
improved significantly. In a toy example, we showed that for pair production of dijet resonances one can
realize a 49% improvement in $S/\delta B$.

Using weighted events from multiple interpretations has the potential to improve substantially
searches for the Higgs boson and measurements of its properties. We found that for $pp \to ZH \to \nu\bar{\nu}b\bar{b}$
events at the 8 TeV LHC with $p_T > 120$ GeV, one can realize a 28% improvement in significance over
an equivalent classical analysis. We chose this $p_T$ fairly arbitrarily. With a $p_T$ cut less than 120 GeV,
we still expect Qanti-kT to improve significance, although perhaps not by as much. That is, the
methodology of using multiple interpretations is not restricted to the highly boosted regime, as are other
approaches to finding the Higgs in this channel [24]. For other Higgs associated production channels
(such as $pp \to WH$ and $pp \to t\bar{t}H$) with $H \to b\bar{b}$, we expect the Qjets framework to be similarly
helpful.

The Qanti-kT algorithm introduced in this paper can be used whenever ordinary anti-kT is employed.
While more complex event topologies tend to benefit more, Qanti-kT will at least never make an
analysis worse. Indeed, since for $\alpha \to \infty$ Qanti-kT reduces to ordinary anti-kT, as long as one scans
over $\alpha$, no harm can come (other than wasting time). It is natural to consider applying Qanti-kT or
some variation within the Qjets framework to challenging processes, such as top-tagging. When tops
are very boosted, it is likely that substructure methods will work better [29] (although merging Qjets
with substructure is also promising); however, in the intermediate regime [30] with moderate boost,
Qjets could help a lot. It would also be interesting to see if Qjets can help with color flow [31, 32],
quark and gluon discrimination [33, 34], ISR tagging [35] or in any situation where ambiguities in
reconstruction are problematic.